Abstract. Let $A$ be an $n$ by $N$ real valued random matrix, and $H^N$ denote
the $N$-dimensional hypercube. For numerous random matrix ensembles, the
expected number of $k$-dimensional faces of the random $n$-dimensional
zonotope $AH^N$ obeys the formula
$\mathcal{E} f_k(AH^N)/f_k(H^N) = 1 - P_{N-n,N-k}$, where $P_{N-n,N-k}$ is a
fair-coin-tossing probability:

$P_{N-n,N-k} = \mathrm{Prob}\{N-n-1 \text{ or fewer successes in } N-k-1 \text{ tosses}\}$.

The formula applies, for example, where the columns of $A$ are drawn i.i.d.
from an absolutely continuous symmetric distribution. The formula exploits
Wendel's Theorem [19].

Let $\mathbb{R}^N_+$ denote the positive orthant; the expected number of
$k$-faces of the random cone $A\mathbb{R}^N_+$ obeys
$\mathcal{E} f_k(A\mathbb{R}^N_+)/f_k(\mathbb{R}^N_+) = 1 - P_{N-n,N-k}$. The
formula applies to numerous matrix ensembles, including those with i.i.d. random
columns from an absolutely continuous, centrally symmetric distribution.

The probabilities $P_{N-n,N-k}$ change rapidly from nearly 0 to nearly 1 near
$k \approx 2n - N$. Consequently, there is an asymptotically sharp threshold in
the behavior of face counts of the projected hypercube; thresholds known for
projecting the simplex and the cross-polytope occur at very different locations.
We briefly consider face counts of the projected orthant when $A$ does not have
mean zero; these do behave similarly to those for the projected simplex. We
consider non-random projectors of the orthant; the 'best possible' $A$ is the one
associated with the first $n$ rows of the Fourier matrix.

These geometric face-counting results have implications for signal processing,
information theory, inverse problems, and optimization. Most of these
flow in some way from the fact that face counting is related to conditions for
uniqueness of solutions of underdetermined systems of linear equations.

a) A vector in $\mathbb{R}^N_+$ is called $k$-sparse if it has at most $k$
nonzeros. For such a $k$-sparse vector $x_0$, let $b = Ax_0$, where $A$ is a
random matrix ensemble covered by our results. With probability
$1 - P_{N-n,N-k}$ the inequality-constrained system $Ax = b$, $x \ge 0$ has
$x_0$ as its unique nonnegative solution. This is so even if $n < N$, so that
the system $Ax = b$ is underdetermined.

b) A vector in the hypercube $H^N$ will be called $k$-simple if all entries
except at most $k$ are at the bounds 0 or 1. For such a $k$-simple vector $x_0$,
let $b = Ax_0$, where $A$ is a random matrix ensemble covered by our results.
With probability $1 - P_{N-n,N-k}$ the inequality-constrained system $Ax = b$,
$x \in H^N$ has $x_0$ as its unique solution in the hypercube.






Date: May 2008. 

2000 Mathematics Subject Classification. 52A22, 52B05, 52B11, 52B12, 62E20, 68P30, 68P25,
68W20, 68W40, 94B20, 94B35, 94B65, 94B70.

The authors would like to thank the Isaac Newton Mathematical Institute at Cambridge
University for hosting the programme "Statistical Challenges of High Dimensional Data" in 2008,
and Professor D.M. Titterington for organizing this programme. DLD acknowledges support from
NSF DMS 05-05303 and a Rothschild Visiting Professorship at the University of Cambridge.

JT acknowledges support from the Alfred P. Sloan Foundation and thanks John E. and Marva 
M. Warnock for their generous support in the form of an endowed chair. 




Keywords. Zonotope, Random Polytopes, Random Cones, Wendel's Theorem,
Threshold Phenomena, Universality, Random Matrices, Compressed Sensing,
Unique Solution of Underdetermined Systems of Linear Equations.



1. Introduction 

There are 3 fundamental regular polytopes in $\mathbb{R}^N$, $N \ge 5$: the
hypercube $H^N$, the cross-polytope $C^N$, and the simplex $T^{N-1}$. For each
of these, projecting the vertices into $\mathbb{R}^n$, $n < N$, yields the
vertices of a new polytope; in fact, every polytope in $\mathbb{R}^n$ can be
generated by rotating the simplex $T^{N-1}$ and orthogonally projecting on the
first $n$ coordinates, for some choice of $N$ and of $N$-dimensional rotation.
Similarly, every centro-symmetric polytope can be generated by projecting the
cross-polytope, and every zonotope by projecting the hypercube.

1.1. Random polytopes. Choosing the projection $A$ at random has become
popular. Let $A$ be an $n \times N$ uniformly distributed random orthogonal
projection, obtained by first applying a uniformly-distributed rotation to
$\mathbb{R}^N$ and then projecting on the first $n$ coordinates. Let $Q$ be a
polytope in $\mathbb{R}^N$. Then $AQ$ is a random polytope in $\mathbb{R}^n$.
Taking $Q$ in turn from each of the three families of regular polytopes we get
three arenas for scholarly study:

• Random polytopes of the form $AT^{N-1}$ were first studied by Affentranger
and Schneider [1] and by Vershik and Sporyshev [18];

• Random polytopes of the form $AC^N$ were first studied extensively by
Böröczky and Henk [5];

• The random zonotope $AH^N$ will be heavily studied in this paper;
beginnings of a literature on zonotopes can be found in [4], [2].

Such random polytopes can have face lattices undergoing abrupt changes in
properties as dimensions change only slightly. In the case of $AT^{N-1}$ and
$AC^N$, previous work by the authors [7], [10], [13], [12] documented the
following threshold phenomenon. (Our work built on fundamental formulas
developed by Affentranger and Schneider [1] and used an asymptotic framework
pioneered by Vershik and Sporyshev [18], who pointed to the first such threshold
effect.) Let $f_k(Q)$ denote the number of $k$-dimensional faces of the
polyhedron $Q$. It turns out that for large $n$, the number $f_k(AQ)$ of
$k$-dimensional faces of $AQ$ might either be approximately equal to $f_k(Q)$ or
else significantly smaller, depending on the size of $k$ relative to a threshold
depending on the ratio of $n$ to $N$.

To make this precise, consider the following proportional-dimensional asymptotic
framework. A dimension specifier is a triple of integers $(k,n,N)$, representing a
'face' dimension $k$, a 'small' dimension $n$ and a 'large' dimension $N$;
$k < n < N$. For fixed $\delta, \rho \in (0,1)$, consider sequences of dimension
specifiers, indexed by $n$, and obeying

(1.1) $k_n/n \to \rho$ and $n/N_n \to \delta$.

For such sequences the small dimension $n$ is held proportional to the large
dimension $N$ as both dimensions grow. We omit subscripts on $k_n$ and $N_n$
when possible. For $Q = T^{N-1}, C^N$, the papers [7], [10], [13], [12]
exhibited thresholds $\rho_W(\delta;Q)$ for the ratio between the expected
number of faces of the low-dimensional polytope $AQ$ and the number of faces of
the high-dimensional polytope $Q$:

(1.2) $\lim_{n\to\infty} \frac{\mathcal{E} f_k(AQ)}{f_k(Q)} \begin{cases} = 1, & \rho < \rho_W(\delta;Q) \\ < 1, & \rho > \rho_W(\delta;Q) \end{cases}$

(In this relation, we take a limit as $n \to \infty$ along some sequence obeying
the proportional-dimensional constraint (1.1).) In words, the random object $AQ$
has roughly as many $k$-faces as its generator $Q$, for $k$ below a threshold;
and has noticeably fewer $k$-faces than $Q$, for $k$ above the threshold. The
threshold functions are defined in terms of Gaussian integrals and other special
functions, and can be calculated numerically.

These phenomena, described here from the viewpoint of combinatorial geometry, 
have surprising consequences in probability theory, information theory and signal 
processing; see [HIIIIIIS], and Section [5] below. 

1.2. Random Zonotopes. Missing from the above picture is information about
the third family of regular polytopes, the hypercube. Böröczky and Henk [5]
discussed it in passing, but only considered the asymptotic framework where the
small dimension $n$ is held fixed while the large dimension $N \to \infty$. In
that framework, the threshold phenomenon is not visible. In this paper, we again
consider the proportional-dimensional case (1.1) and prove the following.

Theorem 1.1 ('Weak' Threshold for Hypercube). Let

(1.3) $\rho_W(\delta; H^N) := \max(0, 2 - \delta^{-1})$.

For $\rho, \delta$ in $(0,1)$, consider a sequence of dimension specifiers
$(k,n,N)$ obeying (1.1). Let $A$ denote a uniformly-distributed random
orthogonal projection from $\mathbb{R}^N$ to $\mathbb{R}^n$. Then

(1.4) $\lim_{n\to\infty} \frac{\mathcal{E} f_k(AH^N)}{f_k(H^N)} = \begin{cases} 1, & \rho < \rho_W(\delta; H^N) \\ 0, & \rho > \rho_W(\delta; H^N) \end{cases}$

Thus we prove a sharp discontinuity in the behavior of the face lattices of random
zonotopes; the location of the threshold is precisely identified. (Such sharpness of
the phase transition is also observed empirically for (1.2) above; to our knowledge,
a proof of discontinuity has not yet been published in that setting.) Our use of
the modifier 'weak' and the subscript $W$ on $\rho$ matches usage in the previous
cases $T^{N-1}$ and $C^N$.

Although this result has been stated in the language of combinatorial convexity,
as with the earlier results for $AT^{N-1}$ and $AC^N$, there are implications for
applied fields including optimization and signal processing; see Section 5 below.

1.3. More General Notion of Random Projection. In fact, Theorem 1.1 is
only the tip of the iceberg. The ensemble of random matrices used in that result
(the uniformly distributed random orthoprojector) is only one example of a random
matrix ensemble for which the conclusion (1.4) holds. As it turns out, what really
matters are the statistical properties of the nullspace of $A$.

Definition 1.2 (Orthant-Symmetry). Let $B$ be a random $N - n$ by $N$ matrix
such that for each diagonal matrix $S$ with diagonal in $\{-1,1\}^N$, and for every
measurable set $\Omega$,

$\mathrm{Prob}\{BS \in \Omega\} = \mathrm{Prob}\{B \in \Omega\}$.

Then we say that $B$ is an orthant-symmetric random matrix. Let $V_B$ be the linear
span of the rows of $B$. If $B$ is an orthant-symmetric random matrix we say that
$V_B$ is an orthant-symmetric random subspace.
Remark 1.3 (Orthant-Symmetric Ensembles). The following ensembles of random
matrices are orthant-symmetric:

• Uniformly-distributed random orthoprojectors from $\mathbb{R}^N$ to
$\mathbb{R}^{N-n}$; implicitly this was the example considered earlier.

• Gaussian Ensembles. A random matrix $B$ with entries chosen from a Gaussian
zero-mean distribution, i.e. such that the $(N-n) \cdot N$-element vector
$\mathrm{vec}(B)$ is $N(0,\Sigma)$ with $\Sigma$ a nondegenerate covariance matrix.

• Symmetric i.i.d. Ensembles. Matrices with entries sampled i.i.d. from
a symmetric probability distribution; examples include Gaussian $N(0,1)$,
uniform on $[-1,1]$, uniform from the set $\{-1,1\}$, and from the set $\{-1,0,1\}$
where $-1$ and $1$ have equal non-zero probability.

• Sign Ensembles. For any fixed generator matrix $B_0$, let the random matrix
$B = B_0 S$ where $S$ is a random diagonal matrix with entries drawn
uniformly from $\{-1,1\}$.

New orthant-symmetric ensembles can be created from an existing one by
multiplying on the left by an arbitrary random matrix $T$ which is stochastically
independent of $B$, and multiplying on the right by a random diagonal matrix $R$
also stochastically independent of $B$ and $T$: thus $B' = TBR$ inherits orthant
symmetry from $B$.
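As a finite sanity check of Definition 1.2, the sketch below enumerates the exact law of a small Sign Ensemble $B = B_0 S$ and confirms that right-multiplying by any fixed sign matrix leaves that law unchanged, while a point mass at $B_0$ fails the test. The generator matrix and the helper names (`flip_columns`, `sign_ensemble_law`, `law_is_orthant_symmetric`) are illustrative assumptions, not objects from the paper.

```python
from collections import Counter
from itertools import product


def flip_columns(B, signs):
    """Return B * diag(signs) for a matrix stored as a tuple of row tuples."""
    return tuple(tuple(b * s for b, s in zip(row, signs)) for row in B)


def sign_ensemble_law(B0):
    """Exact law of B = B0 * S, with S uniform over diagonal sign matrices.

    The law is a Counter mapping each realization to its multiplicity
    among the 2^N equally likely sign patterns.
    """
    N = len(B0[0])
    return Counter(flip_columns(B0, s) for s in product((-1, 1), repeat=N))


def law_is_orthant_symmetric(law, N):
    """Check that the law of B S' equals the law of B for every sign matrix S'."""
    for s_prime in product((-1, 1), repeat=N):
        shifted = Counter()
        for B, count in law.items():
            shifted[flip_columns(B, s_prime)] += count
        if shifted != law:
            return False
    return True


# Hypothetical 2 x 3 generator matrix, chosen only for illustration.
B0 = ((1, 2, 0), (3, -1, 4))
print(law_is_orthant_symmetric(sign_ensemble_law(B0), 3))  # Sign Ensemble
print(law_is_orthant_symmetric(Counter({B0: 1}), 3))       # point mass at B0
```

The check is exhaustive, so it is only practical for very small $N$; it illustrates the definition rather than proving anything about continuous ensembles.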

Definition 1.4 (General Position). Let $B$ be a random $N - n$ by $N$ matrix such
that every subset of $N - n$ columns is almost surely linearly independent. Let $V_B$
be the linear span of the rows of $B$. We say that $V_B$ is a generic random subspace.

Many orthant-symmetric ensembles from our list create generic row spaces: 

• Uniformly-distributed random orthoprojectors; 

• Gaussian Ensembles; 

• Symmetric iid ensembles having an absolutely continuous distribution; 

• Sign Ensembles with generator matrix $B_0$ having its columns in general
position.

Define a censored symmetric iid ensemble as a symmetric iid ensemble from which 
we discard realizations B where the columns happen to be not in general position. 
Censoring a symmetric iid ensemble made from the Bernoulli { — 1, +1} coin tossing 
distribution produces a new random matrix model B whose realizations are in 
general position with probability one. (The probability of a censoring event is 
exponentially small in N, [17]). 

Theorem 1.5 ('Weak' Threshold for Hypercube). Let the random matrix $A$ have
a random nullspace which is orthant symmetric and generic. In the
proportional-dimensional framework (1.1) the random zonotope $AH^N$ obeys the
same conclusion (1.4) as in Theorem 1.1.

In a sense, this theorem extends the conclusion of Theorem 1.1 to vastly more
cases. It has been previously observed that some results known for the
Goodman-Pollack random orthoprojector model actually extend to other ensembles
of random matrices. It was observed for the simplex by Affentranger and
Schneider [1], and proven by Baryshnikov and Vitale [3], [2], that face-counting
results known for uniformly-distributed random orthoprojectors follow as well for
Gaussian i.i.d. matrices $A$.



FACES OF THE RANDOMLY-PROJECTED HYPERCUBE AND ORTHANT 



Our extension of Theorem 1.1 from orthoprojectors to orthant-symmetric null
spaces in Theorem 1.5 follows this program. However, it is a vastly larger extension.



1.4. Random Cone. Convex cones provide another type of fundamental polyhedral
set. Amongst these, the simplest and most natural is the positive orthant
$P = \mathbb{R}^N_+$. The image of a cone under projection
$A : \mathbb{R}^N \to \mathbb{R}^n$ is again a cone $K = AP$. Typically the cone
has $f_0(K) = 1$ vertex (at 0), and $f_1(K) = N$ extreme rays, etc. In fact,
every such pointed cone in $\mathbb{R}^n$ can be generated as a projection of the
positive orthant, with an appropriate orthogonal projection from an
appropriate $\mathbb{R}^N$.

As with the polytope models, surprising threshold phenomena can arise when
the projector is random.

Theorem 1.6 ('Weak' Threshold for Orthant). Let $A$ be a random matrix whose
nullspace is an orthant-symmetric and generic random subspace. In the
proportional-dimensional framework (1.1) we have

(1.5) $\lim_{n\to\infty} \frac{\mathcal{E} f_k(A\mathbb{R}^N_+)}{f_k(\mathbb{R}^N_+)} = \begin{cases} 1, & \rho < \rho_W(\delta;\mathbb{R}^N_+) \\ 0, & \rho > \rho_W(\delta;\mathbb{R}^N_+) \end{cases}$

with $\rho_W(\delta;\mathbb{R}^N_+) = \rho_W(\delta;H^N)$ as defined in (1.3).

Here the threshold for the orthant is at precisely the same place as it was for
the hypercube. Theorem 1.6 is proven in Section 2.3, and there are significant
implications in optimization and signal processing, briefly discussed in Section 5.

1.5. Exact equality in the number of faces. Our focus in Sections 1.1-1.4
has been on the 'weak' agreement of $\mathcal{E} f_k(AQ)$ with $f_k(Q)$; we have
seen that, in the proportional-dimensional framework, for $\rho$ below the
threshold $\rho_W(\delta;Q)$, we have limiting relative equality:

$\frac{\mathcal{E} f_k(AQ)}{f_k(Q)} \to 1, \quad n \to \infty.$

We now focus on the 'strong' agreement; it turns out that in the
proportional-dimensional framework, for $\rho$ below a somewhat lower threshold
$\rho_S(\delta;Q)$, we actually have exact equality with overwhelming probability:

(1.6) $\mathrm{Prob}\{f_k(Q) = f_k(AQ)\} \to 1, \quad n \to \infty.$

The existence of such 'strong' thresholds for $Q = T^{N-1}$ and $Q = C^N$ was
proven in [7], [10], which exhibited thresholds $\rho_S(\delta;Q)$ below which
(1.6) occurs. These "strong thresholds" and the previously mentioned "weak
thresholds" (1.2) are depicted in Figure 3.1. A similar strong threshold also
holds for the projected orthant.

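Theorem 1.7 ('Strong' Threshold for Orthant). Let

(1.7) $H(\gamma) := \gamma \log(1/\gamma) - (1-\gamma)\log(1-\gamma)$

denote the usual (base-$e$) Shannon Entropy. Let

(1.8) $\psi_S^{\mathbb{R}_+}(\delta,\rho) := H(\delta) + \delta H(\rho) - (1 - \rho\delta)\log 2.$

For $\delta \ge 1/2$, let $\rho_S(\delta;\mathbb{R}^N_+)$ denote the zero crossing
of $\psi_S^{\mathbb{R}_+}(\delta,\rho)$. In the proportional-dimensional framework
(1.1) with $\rho < \rho_S(\delta;\mathbb{R}^N_+)$,

(1.9) $\mathrm{Prob}\{f_k(A\mathbb{R}^N_+) = f_k(\mathbb{R}^N_+)\} \to 1, \quad \text{as } n \to \infty.$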
The thresholds $\rho_W(\delta;Q)$ for $Q = \mathbb{R}^N_+$ and $H^N$, and
$\rho_S(\delta;\mathbb{R}^N_+)$, are depicted in Figure 1.1.

Figure 1.1. The 'weak' thresholds, $\rho_W(\delta;H^N)$ and
$\rho_W(\delta;\mathbb{R}^N_+)$ (black), and a lower bound on the strong
threshold for the positive orthant, $\rho_S(\delta;\mathbb{R}^N_+)$ (blue).

In contrast to the projected simplex, cross-polytope, and orthant, for the
hypercube there is no nontrivial regime where a phenomenon like (1.6) can occur.



Lemma 1.8 (Zonotope Vertices). Let $A$ be an $n \times N$ matrix, and let $H^N$
be the $N$ dimensional hypercube. Then

$f_k(AH^N) < f_k(H^N), \quad k = 0, 1, 2, \dots$

Proof of Lemma 1.8. In fact, we will show that $AH^N$ always has fewer than $2^N$
vertices. This immediately implies the full result. There exists a
$w \in \mathcal{N}(A)$ with $w \ne 0$. $H^N$ has a vertex $x_0$ obeying

$x_0(i) = \begin{cases} 0, & \mathrm{sgn}(w(i)) \ge 0, \\ 1, & \text{else.} \end{cases}$

Let $x_t := x_0 + tw$ with $t > 0$. For $t$ sufficiently small $x_t$ is in the
interior of $H^N$, and by construction $Ax_0 = Ax_t$. Invoking Lemma 2.3, $x_0$
is not a vertex of $AH^N$, and $f_0(AH^N) < f_0(H^N)$. □

Although this proof only highlights a single vertex of $H^N$ that is interior to
$AH^N$, it is clear from its construction that there are typically many such lost
vertices. Theorem 1.7 is proven in Section 2.5.
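The construction in the proof of Lemma 1.8 can be traced on a tiny example. The sketch below uses a hypothetical $2 \times 3$ matrix and a hand-picked null vector with no zero entries (both are assumptions made for illustration) and checks, in exact rational arithmetic, that the vertex $x_0$ and the nearby interior point $x_t$ have the same image under $A$.

```python
from fractions import Fraction


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def hidden_vertex(w):
    """Vertex x0 of the hypercube with x0(i) = 0 if w(i) >= 0, and 1 otherwise."""
    return [0 if wi >= 0 else 1 for wi in w]


def shifted_point(x0, w, t):
    """The point x_t = x0 + t * w."""
    return [xi + t * wi for xi, wi in zip(x0, w)]


# Hypothetical example: A is 2 x 3 and w spans its nullspace.
A = [[1, 1, 0], [0, 1, 1]]
w = [1, -1, 1]
t = Fraction(1, 10)

x0 = hidden_vertex(w)
xt = shifted_point(x0, w, t)

print(matvec(A, w))                    # w lies in the nullspace of A
print(matvec(A, x0) == matvec(A, xt))  # x0 and x_t have the same image
print(all(0 < xi < 1 for xi in xt))    # x_t is interior to the hypercube
```

When some entries of $w$ vanish, the corresponding coordinates of $x_t$ stay on the boundary, so this sketch deliberately picks a $w$ without zeros.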



1.6. Exact Non-Asymptotic Results. We have so far exclusively used the
Vershik-Sporyshev proportional-dimensional asymptotic framework; this makes for the most
natural comparisons between results for the three families of regular polytopes. 
However, for the positive orthant and hypercube, something truly remarkable hap- 
pens: there is a simple exact expression for finite N which connects to a beautiful 
result in geometric probability. 
Theorem 1.9 (Wendel, [19]). Let $M$ points in $\mathbb{R}^m$ be drawn i.i.d. from
a centro-symmetric distribution such that the points are in general position; then
the probability that all the points fall in some half space is

(1.10) $P_{m,M} = 2^{-M+1} \sum_{\ell=0}^{m-1} \binom{M-1}{\ell}.$

This elegant result is often presented as simply a piece of recreational
mathematics. In our setting, it turns out to be truly powerful, because of the
following identity.

Theorem 1.10. Let $A$ be an $n \times N$ random matrix with an orthant-symmetric
and generic random nullspace. Then

(1.11) $\frac{\mathcal{E} f_k(A\mathbb{R}^N_+)}{f_k(\mathbb{R}^N_+)} = 1 - P_{N-n,N-k}.$

Symmetry implies a similar identity for the hypercube:

Theorem 1.11. Let $A$ be a random matrix with an orthant-symmetric and generic
random nullspace. Then

(1.12) $\frac{\mathcal{E} f_k(AH^N)}{f_k(H^N)} = 1 - P_{N-n,N-k}.$

These formulae are not at all asymptotic or approximate. But all the earlier
asymptotic results derive from them. Theorem 1.10 is proven in Section 2.1, and
the symmetry argument for Theorem 1.11 is formalized in Lemma 2.6 and proven
in Section 2.4.
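Because (1.10)-(1.12) are exact, they can be evaluated directly for any finite triple $(k,n,N)$. The following is a minimal sketch, assuming only the formulas above; the function names `wendel_probability` and `expected_face_ratio` are hypothetical labels introduced here, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from math import comb


def wendel_probability(m, M):
    """P_{m,M} = 2^{-(M-1)} * sum_{l=0}^{m-1} C(M-1, l), as an exact fraction."""
    if m < 0 or M < 1:
        raise ValueError("need m >= 0 and M >= 1")
    # math.comb returns 0 when l > M - 1, so m > M is handled automatically.
    total = sum(comb(M - 1, l) for l in range(m))
    return Fraction(total, 2 ** (M - 1))


def expected_face_ratio(k, n, N):
    """E f_k(AQ) / f_k(Q) = 1 - P_{N-n,N-k}, for Q the orthant or the hypercube."""
    if not 0 <= k < n < N:
        raise ValueError("need 0 <= k < n < N")
    return 1 - wendel_probability(N - n, N - k)


# Expected number of vertices of a random zonotope A H^5 in R^3.
print(expected_face_ratio(0, 3, 5) * 2 ** 5)
```

For $k = 0$, $n = 3$, $N = 5$ the ratio is $11/16$, so the expected vertex count is $22$, which agrees with the classical count $2\sum_{j<n}\binom{N-1}{j} = 2(1+4+6)$ for a zonotope with generators in general position.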



1.7. Contents. Theorem 1.6 is proven in Section 2.3, Theorems 1.1 and 1.5 are
proven in Section 2.4, and Theorem 1.7 is proven in Section 2.5, each using the
classical Wendel's Theorem [19], Theorem 1.9. Their relationships with existing
results in convex geometry and matroid theory are discussed in Section 3; the
implications of these results for information theory, signal processing, and
optimization are briefly discussed in Section 5.

2. Proof of main results 

Our plan is to start with the key non-asymptotic exact identity (1.11) and then
derive from it Theorem 1.6 by asymptotic analysis of the probabilities in Wendel's
Theorem. We then infer Theorem 1.5, and later Theorem 1.7 follows in Section 2.5.



2.1. Proof of Theorem 1.10. Here and below we follow the convention that, if
we don't give the proof of a lemma or corollary immediately following its statement, 
then the proof can be found in Section [Sj 

Our proof of the key formula (1.11) starts with the following observation on the
expected number of $k$-faces of $A\mathbb{R}^N_+$.

(2.1) $\frac{\mathcal{E} f_k(A\mathbb{R}^N_+)}{f_k(\mathbb{R}^N_+)} = \mathrm{Ave}_F\left[\mathrm{Prob}\{AF \text{ is a } k\text{-face of } A\mathbb{R}^N_+\}\right].$

Here $\mathrm{Ave}_F$ denotes the arithmetic mean over all $k$-faces $F$ of
$\mathbb{R}^N_+$.

Because of (2.1) we will be implicitly averaging across faces below. As a
calculation device we suppose that all faces are statistically equivalent; this
allows us to study one $k$-face, and yet compute the average across all $k$-faces.
Definition 2.1 (Exchangeable columns). Let $A$ be a random $n$ by $N$ matrix such
that for each permutation matrix $\Pi$, and for every measurable set $\Omega$,

$\mathrm{Prob}\{A \in \Omega\} = \mathrm{Prob}\{A\Pi \in \Omega\}.$

Then we say that $A$ has exchangeable columns.

Below we assume without loss of generality that $A$ has exchangeable columns.
Then (2.1) becomes: let $F$ be a fixed $k$-face of $\mathbb{R}^N_+$; then

(2.2) $\frac{\mathcal{E} f_k(A\mathbb{R}^N_+)}{f_k(\mathbb{R}^N_+)} = \mathrm{Prob}\{AF \text{ is a } k\text{-face of } A\mathbb{R}^N_+\}.$

Let $P$ be a polytope in $\mathbb{R}^N$ and $x_0 \in P$. The vector $v$ is a
feasible direction for $P$ at $x_0$ if $x_0 + tv \in P$ for all sufficiently small
$t > 0$. Let $\mathrm{Feas}_{x_0}(P)$ denote the set of all feasible directions for
$P$ at $x_0$.

Lemma 2.2. Let $x_0$ be a vector in $\mathbb{R}^N_+$ with exactly $k$ nonzeros.
Let $F$ denote the associated $k$-face of $\mathbb{R}^N_+$. For an $n \times N$
matrix $A$, let $AF$ denote the image of $F$ under $A$. The following are
equivalent:

(Survive$(A, F, \mathbb{R}^N_+)$): $AF$ is a $k$-face of $A\mathbb{R}^N_+$,

(Transverse$(A, x_0, \mathbb{R}^N_+)$): $\mathcal{N}(A) \cap \mathrm{Feas}_{x_0}(\mathbb{R}^N_+) = \{0\}$.

We now develop the connections to the probabilities in Wendel's theorem. 

Lemma 2.3. Let $x_0 \in \mathbb{R}^N_+$ have $k$ nonzeros. Let $A$ be
$n \times N$ with $n < N$ and have an orthant-symmetric null space with
exchangeable columns. Then

$\mathrm{Prob}\{(\mathrm{Transverse}(A, x_0, \mathbb{R}^N_+)) \text{ Holds}\} = 1 - P_{N-n,N-k}.$



Proof. Exchangeability of the columns implies that

$\mathrm{Prob}\{(\mathrm{Transverse}(A, x_0, \mathbb{R}^N_+)) \text{ Holds}\}$

does not depend on $x_0$, but only on the number of nonzeros in $x_0$ and the size
of $A$. Therefore, let $k$ be the number of nonzeros in $x_0$, and set

$\pi_{k,n,N} = \mathrm{Prob}\{(\mathrm{Transverse}(A, x_0, \mathbb{R}^N_+)) \text{ Holds}\}.$

The matrix $A$ has its columns in general position. Therefore we may construct
a basis $b_i$ for its null space, $\mathcal{N}(A)$, having exactly $N - n$ basis
vectors. The $N$ by $N - n$ matrix $B^T$ having the $b_i$ for its columns
generates every vector $w$ in $\mathcal{N}(A)$ via a product of the form
$w = B^T c$, where $c \in \mathbb{R}^{N-n}$.

Without loss of generality, suppose the nonzeros of $x_0$ are in positions
$i = N-k+1, \dots, N$. Then
$\mathrm{Feas}_{x_0}(\mathbb{R}^N_+) = \{v : v_1, \dots, v_{N-k} \ge 0\}$.
Condition (Transverse$(A, x_0, \mathbb{R}^N_+)$) can be restated as

(Ineq) The only vector $c$ satisfying
(2.3)  $(B^T c)_i \ge 0, \quad i = 1, \dots, N-k,$
       is the vector $c = 0$.

Suppose the contrary to (Ineq), i.e. suppose there is a $c \ne 0$ solving (2.3).
Let now $\beta_i$ denote the $i$-th row of $B^T$, with $i = 1, \dots, N-k$. Then
(2.3) is the same as

$\beta_i \cdot c \ge 0, \quad i = 1, \dots, N-k.$

Geometrically, this says that

Each vector $\beta_i$, $i = 1, \dots, N-k$,
falls in the half-space $\beta \cdot c \ge 0$.

Here $c$ is some fixed but arbitrary nonzero vector. Thus the event {(Ineq) does
not hold} is equivalent to the event

All the vectors $\beta_i$ with $i = 1, \dots, N-k$
fall in some half-space of $\mathbb{R}^{N-n}$.

By our hypothesis, the vectors $\beta_i$ with $i = 1, \dots, N-k$ are drawn i.i.d.
from a centrosymmetric distribution and are in general position. We now invoke
Wendel's Theorem, and it follows that

$\pi_{k,n,N} = 1 - P_{N-n,N-k}.$

□
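The half-space event that drives this proof is easy to simulate in the plane. The sketch below is a Monte Carlo check of Wendel's formula for $m = 2$, where (1.10) reduces to $P_{2,M} = M/2^{M-1}$. It assumes standard Gaussian points (one admissible centro-symmetric distribution) and half-planes whose boundary passes through the origin; the function names, trial count, and seed are arbitrary illustrative choices.

```python
import math
import random


def in_some_half_plane(points):
    """True if all points lie in an open half-plane bounded by a line through 0.

    This holds exactly when the sorted polar angles leave an angular gap
    larger than pi somewhere around the circle.
    """
    angles = sorted(math.atan2(y, x) for x, y in points)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))  # wrap-around gap
    return max(gaps) > math.pi


def estimate_half_plane_probability(M, trials=20000, seed=0):
    """Monte Carlo estimate of P_{2,M} using i.i.d. standard Gaussian points."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(M)]
        hits += in_some_half_plane(points)
    return hits / trials


def wendel_plane(M):
    """Formula (1.10) with m = 2: 2^{-(M-1)} * (C(M-1,0) + C(M-1,1))."""
    return M / 2 ** (M - 1)


for M in (3, 4, 6):
    print(M, estimate_half_plane_probability(M), wendel_plane(M))
```

The estimates should agree with the formula up to sampling error of order $1/\sqrt{\text{trials}}$.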

2.2. Some Generalities about Binomial Probabilities. The probability $P_{m,M}$
in Wendel's theorem has a classical interpretation: it gives the probability of at
most $m - 1$ heads in $M - 1$ tosses of a fair coin. The usual Normal
approximation to the binomial tells us that

$P_{m,M} \approx \Phi\left(\frac{(m-1) - (M-1)/2}{\sqrt{(M-1)/4}}\right)$

with $\Phi$ the usual standard normal distribution function
$\Phi(x) = \int_{-\infty}^{x} e^{-y^2/2}\,dy/\sqrt{2\pi}$; here the approximation
symbol $\approx$ can be made precise using standard limit theorems, e.g.
appropriate for small or large deviations. In this expression, the approximating
normal has mean $(M-1)/2$ and standard deviation $\sqrt{(M-1)/4}$. There are three
regimes of interest, for large $m$, $M$, and three behaviors for $P_{m,M}$.

• Lower Tail: $m \ll M/2 - \sqrt{M/4}$. $P_{m,M} \approx 0$.

• Middle: $m \approx M/2$. $P_{m,M} \in (0,1)$.

• Upper Tail: $m \gg M/2 + \sqrt{M/4}$. $P_{m,M} \approx 1$.
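The three regimes can be seen numerically by comparing the exact sum (1.10) with the Normal approximation above. This is a small sketch under the stated formulas; the function names and the choice $M = 401$ are illustrative assumptions, and no continuity correction is applied, so the middle-regime values agree only roughly.

```python
from math import comb, erf, sqrt


def wendel_probability(m, M):
    """Exact P_{m,M}: probability of at most m - 1 heads in M - 1 fair tosses."""
    return sum(comb(M - 1, l) for l in range(m)) / 2 ** (M - 1)


def normal_approximation(m, M):
    """Phi(((m - 1) - (M - 1)/2) / sqrt((M - 1)/4)), without continuity correction."""
    z = ((m - 1) - (M - 1) / 2) / sqrt((M - 1) / 4)
    return 0.5 * (1 + erf(z / sqrt(2)))


M = 401
for m, regime in ((150, "lower tail"), (200, "middle"), (250, "upper tail")):
    print(regime, wendel_probability(m, M), normal_approximation(m, M))
```

With $M = 401$ the standard deviation is $10$, so $m = 150$ and $m = 250$ sit about five standard deviations from the mean, well inside the two tails.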

2.3. Proof of Theorem 1.6. Using the correspondence $N - n \leftrightarrow m$,
$N - k \leftrightarrow M$, and the connection to Wendel's theorem, we have three
regimes of interest:

• $N - n \ll (N-k)/2$

• $N - n \approx (N-k)/2$

• $N - n \gg (N-k)/2$

In the proportional-dimensional framework, the above discussion translates into
three separate regimes, and separate behaviors we expect to be true:

• Case 1: $\rho < \rho_W(\delta;H^N)$. $P_{N_n-n,N_n-k_n} \to 0$.

• Case 2: $\rho = \rho_W(\delta;H^N)$. $P_{N_n-n,N_n-k_n} \in (0,1)$.

• Case 3: $\rho > \rho_W(\delta;H^N)$. $P_{N_n-n,N_n-k_n} \to 1$.

Case 2 is trivially true, but it has no role in the statement of Theorem 1.6. Cases
1 and 3 correspond exactly to the two parts of (1.5) that we must prove.
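Before turning to the proof, Cases 1 and 3 can be observed directly from the exact identity (1.11). The sketch below fixes $\delta = 3/4$, so that $\rho_W = 2 - 1/\delta = 2/3$ by (1.3), and evaluates $1 - P_{N-n,N-k}$ on either side of the threshold for growing $N$. The rounding of $n = \delta N$ and $k = \rho n$ to integers, the function names, and the chosen values are assumptions made for illustration.

```python
from fractions import Fraction


def wendel_probability(m, M):
    """Exact P_{m,M} from (1.10), accumulating binomial coefficients iteratively."""
    term, total = 1, 0  # term holds C(M-1, l)
    for l in range(min(m, M)):
        total += term
        term = term * (M - 1 - l) // (l + 1)
    return Fraction(total, 2 ** (M - 1))


def rho_weak(delta):
    """Weak threshold (1.3): max(0, 2 - 1/delta)."""
    return max(0.0, 2.0 - 1.0 / delta)


def face_ratio(rho, delta, N):
    """1 - P_{N-n,N-k} with n = round(delta * N) and k = round(rho * n)."""
    n = round(delta * N)
    k = round(rho * n)
    return float(1 - wendel_probability(N - n, N - k))


delta = 0.75
print("threshold:", rho_weak(delta))
for N in (400, 4000):
    print(N, face_ratio(0.6, delta, N), face_ratio(0.73, delta, N))
```

As $N$ grows, the value below the threshold moves toward 1 and the value above it toward 0, which is the behavior that Cases 1 and 3 assert in the limit.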

To prove Cases 1 and 3, we need an upper bound deriving from standard large- 
deviations analysis of the lower tail of the binomial. 

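Lemma 2.4. Let $N - n < (N-k)/2$. Then

(2.4) $P_{N-n,N-k} \le n^{3/2} \exp\left(N \psi_W^{\mathbb{R}_+}\left(\frac{n}{N}, \frac{k}{n}\right)\right)$

where the exponent is defined as

(2.5) $\psi_W^{\mathbb{R}_+}(\delta,\rho) := H(\delta) + \delta H(\rho) - H(\rho\delta) - (1 - \rho\delta)\log 2$

with $H(\cdot)$ the Shannon Entropy (1.7).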
N-n



Proof. Upperbounding the sum in PN-n,N-k hy N — n — 1 times ( ^r „ ) we 
arrive at 

,2.) P„.„,„_,,2—.^^(« -..:,(-)(»)(» ■ 



We can bound ( "^) for 7 < 1 using the Shannon entropy (|1.7I 

(2.7) r,«-V^p™«t7j < ( ■■" \ < r.^rnHij) 






where ci := ^•\/2/7r, C2 := 5/4v27r. Recalhng the definition of V'vi^i '^6 obtain 



We will now consider Cases 1 and 3, and prove the corresponding conclusion.

Case 1: $\rho < \rho_W(\delta;\mathbb{R}^N_+)$. The threshold function
$\rho_W(\delta;\mathbb{R}^N_+)$ is defined as the zero level curve
$\psi_W^{\mathbb{R}_+}(\delta, \rho_W(\delta;\mathbb{R}^N_+)) = 0$; thus for any
$\rho$ strictly below $\rho_W(\delta;\mathbb{R}^N_+)$, the exponent
$\psi_W^{\mathbb{R}_+}(\delta,\rho)$ is strictly negative. Lemma 2.4 thus implies
that $P_{N_n-n,N_n-k_n} \to 0$ as $n \to \infty$.

Figure 2.1. Exponent for the weak phase transition,
$\psi_W^{\mathbb{R}_+}(\delta,\rho)$, (2.5), which has its zero level curve at
$\rho_W(\delta;\mathbb{R}^N_+)$, equation (1.3). The projected hypercube has the
same weak phase transition and exponent, $\psi_W^{H} = \psi_W^{\mathbb{R}_+}$.

Case 3: $\rho > \rho_W(\delta;\mathbb{R}^N_+)$. Binomial probabilities have a
standard symmetry (relabel every 'head' outcome as a 'tail', and vice versa). It
follows that $P_{m,M} = 1 - P_{M-m,M}$. We have
$P_{N-n,N-k} = 1 - P_{n-k,N-k}$. In this case $N - n > (N-k)/2$, so Lemma 2.4
tells us that $P_{n-k,N-k} \to 0$ as $n \to \infty$; we conclude
$P_{N-n,N-k} \to 1$ as $n \to \infty$.

2.4. Proofs of Theorems 1.1 and 1.5. We derive the exact non-asymptotic
result Theorem 1.11 from Theorem 1.10 by symmetry. The limit results in Theorems
1.1 and 1.5 follow immediately from the asymptotic analysis of Section 2.3.
We begin as before, relating face counts to probabilities of survival.

(2.8) $\frac{\mathcal{E} f_k(AH^N)}{f_k(H^N)} = \mathrm{Ave}_F\left[\mathrm{Prob}\{AF \text{ is a } k\text{-face of } AH^N\}\right].$

Here $\mathrm{Ave}_F$ denotes the average over $k$-faces of $H^N$.

As before, we assume exchangeable columns as a calculation device, allowing us
to focus on one $k$-face, but compute the average. Under exchangeability, for any
fixed $k$-face $F$,

(2.9) $\frac{\mathcal{E} f_k(AH^N)}{f_k(H^N)} = \mathrm{Prob}\{AF \text{ is a } k\text{-face of } AH^N\}.$



We also again reformulate matters in terms of transversal intersection. 

Lemma 2.5. Let $x_0$ be a vector in $H^N$ with exactly $k$ nonzeros. Let $F$
denote the associated $k$-face of $H^N$. For an $n \times N$ matrix $A$ the
following are equivalent:

(Survive$(A, F, H^N)$): $AF$ is a $k$-face of $AH^N$,

(Transverse$(A, x_0, H^N)$): $\mathcal{N}(A) \cap \mathrm{Feas}_{x_0}(H^N) = \{0\}$.

We next connect the hypercube to the positive orthant. Informally, the point is
that the positive orthant in some sense shares faces with the "lower faces" of the
hypercube.

Formally, let $x_0$ be a vector having $x_0(i) = 0$, $1 \le i \le N-k$, and
$x_0(i) = 1/2$, $N-k+1 \le i \le N$. Then $x_0$ belongs to both $H^N$ and
$\mathbb{R}^N_+$. It makes sense to define the two cones
$\mathrm{Feas}_{x_0}(H^N)$ and $\mathrm{Feas}_{x_0}(\mathbb{R}^N_+)$ for this
specific point $x_0$, and we immediately see

$\mathrm{Feas}_{x_0}(H^N) = \mathrm{Feas}_{x_0}(\mathbb{R}^N_+).$

In fact this equality holds for all $x_0$ in the relative interior of the $k$-face
of $H^N$ containing $x_0$. We conclude:

Lemma 2.6. Let $F_{k,H}$ be the $k$-dimensional face of $H^N$ consisting of all
vectors $x$ with $x(i) = 0$, $1 \le i \le N-k$, and $0 < x(i) < 1$,
$N-k+1 \le i \le N$. Let $F_{k,\mathbb{R}_+}$ be the $k$-dimensional face of
$\mathbb{R}^N_+$ consisting of all vectors $x$ with $x(i) = 0$,
$1 \le i \le N-k$, and $0 < x(i)$, $N-k+1 \le i \le N$. Then

(2.10) $\mathrm{Prob}\{AF_{k,H} \text{ is a } k\text{-face of } AH^N\} = \mathrm{Prob}\{AF_{k,\mathbb{R}_+} \text{ is a } k\text{-face of } A\mathbb{R}^N_+\}.$

Combining (2.8) and Lemma 2.6 we obtain the non-asymptotic Theorem 1.11 from
the corresponding non-asymptotic result for the positive orthant.

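2.5. Proof of Theorem 1.7. By Lemmas 2.2 and 2.3, $P_{N-n,N-k}$ is the
probability that one fixed $k$-dimensional face $F$ of $\mathbb{R}^N_+$ fails to
generate a $k$-face $AF$ of $A\mathbb{R}^N_+$. The probability that some
$k$-dimensional face fails to generate a $k$-face can be upper bounded, using
Boole's inequality, by $f_k(\mathbb{R}^N_+) \cdot P_{N-n,N-k}$.

From (2.4), (2.7), and $f_k(\mathbb{R}^N_+) = \binom{N}{k}$ we have

$f_k(\mathbb{R}^N_+) \cdot P_{N-n,N-k} \le n^{3/2} \exp\left(N \psi_S^{\mathbb{R}_+}(\delta_n, \rho_n)\right)$

where $\psi_S^{\mathbb{R}_+}$ was defined earlier in (1.8), as

(2.11) $\psi_S^{\mathbb{R}_+}(\delta,\rho) := H(\delta) + \delta H(\rho) - (1 - \rho\delta)\log 2.$

Recall that for $\delta \ge 1/2$, $\rho_S(\delta;\mathbb{R}^N_+)$ is the zero
crossing of $\psi_S^{\mathbb{R}_+}$. For any $\rho < \rho_S(\delta;\mathbb{R}^N_+)$
we have $\psi_S^{\mathbb{R}_+}(\delta,\rho) < 0$ and as a result (1.9) follows.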
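Both exponents are elementary to evaluate, so the threshold curves can be computed numerically. The sketch below codes (1.7), (2.5) and (2.11), checks that the weak exponent vanishes on the curve $\rho = 2 - 1/\delta$ of (1.3), and locates the zero crossing of the strong exponent by bisection. The bracket $[0, 2/3]$ is an assumption of this sketch: for $1/2 < \delta < 1$ the strong exponent is negative at $\rho = 0$ and increasing in $\rho$ on that interval. Function names and the tolerance are likewise illustrative.

```python
from math import log


def shannon_entropy(g):
    """Base-e entropy (1.7), with the convention H(0) = H(1) = 0."""
    if g <= 0.0 or g >= 1.0:
        return 0.0
    return -g * log(g) - (1 - g) * log(1 - g)


def psi_weak(delta, rho):
    """Weak exponent (2.5)."""
    return (shannon_entropy(delta) + delta * shannon_entropy(rho)
            - shannon_entropy(rho * delta) - (1 - rho * delta) * log(2))


def psi_strong(delta, rho):
    """Strong exponent (1.8), restated as (2.11)."""
    return (shannon_entropy(delta) + delta * shannon_entropy(rho)
            - (1 - rho * delta) * log(2))


def rho_strong(delta, tol=1e-12):
    """Zero crossing in rho of psi_strong(delta, .), for 1/2 < delta < 1."""
    lo, hi = 0.0, 2.0 / 3.0  # psi_strong < 0 at lo and > 0 at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if psi_strong(delta, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


delta = 0.75
print("weak exponent on the curve rho = 2 - 1/delta:", psi_weak(delta, 2 - 1 / delta))
print("strong threshold estimate:", rho_strong(delta))
```

At $\delta = 3/4$ the strong zero crossing falls between $0.03$ and $0.04$, far below the weak threshold $2/3$, which is consistent with the gap between the two curves described for Figure 1.1.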

3. Contrasting the Hypercube with Other Polytopes 



The theorems in Section 1 contrast strongly with existing results for other
polytopes.
3.1. Non-Existence of Weak Thresholds at $\delta < 1/2$. Theorem 1.5 identifies a
region of $(n/N, k/n)$ where the typical random zonotope has nearly as many
$k$-faces as its generating hypercube; in particular, if $n < N/2$, it has many
fewer $k$-faces than the hypercube, for every $k$. This behavior at $n/N < 1/2$
is quite different from the behavior of typical random projections of the simplex
and the cross-polytope. Those polytopes have $f_k(AQ) \approx f_k(Q)$ for quite a
large range of $k$ even at relatively small values of $n/N$, [13]; see Figure 3.1.




Figure 3.1. Weak thresholds for the simplex, $\rho_W(\delta;T^{N-1})$
(black-dash), and cross-polytope, $\rho_W(\delta;C^N)$ (black-solid). Consider
sequences obeying the proportional-dimensional asymptotic with parameters
$\delta$, $\rho$. For $(\delta,\rho)$ below these curves, and for large $n$, each
projected polytope has nearly as many $k$-faces as its generator; above these
curves the projected polytope has noticeably fewer. Strong thresholds for the
simplex, $\rho_S(\delta;T^{N-1})$ (blue-dash), and cross-polytope,
$\rho_S(\delta;C^N)$ (blue-solid). For $(\delta,\rho)$ below these curves, and for
large $n$, each projected polytope and its generator typically have exactly the
same number of $k$-faces.



3.2. Non-Existence of Strong Thresholds for Hypercube. Lemma 1.8 shows
that projected zonotopes always have strictly fewer $k$-faces than their
generators, $f_k(AH^N) < f_k(H^N)$, for every $n < N$. This is again quite
different from the situation with the simplex and the cross-polytope, where we can
even have $n \ll N$ and still find $k$ for which $f_k(AQ) = f_k(Q)$, [Hj, see
Figure 3.1.

3.3. Universality of weak phase transitions. For Theorems 1.1 and 1.5, $A$ can
be sampled from any ensemble of random matrices having an orthant-symmetric 
and generic random null space. Our result is thus universal across a wide class of 
matrix ensembles. 

In proving weak and strong threshold results for the simplex and cross-polytope, 
we required A to either be a random ortho-projector or to have Gaussian iid entries. 
Thus, what we proved for those families of regular polytopes applies to a much more 
limited range of matrix ensembles than what has now been proven for hypercubes. 

Our empirical studies suggest that the same ensembles of matrices which 'work'
for the hypercube weak threshold also 'work' for the simplex and cross-polytope
thresholds. It seems to us that the universality across matrix ensembles proven here
may point to a much larger phenomenon, valid also for other polytope families. For
our empirical studies see [14].

In fact, even in the hypercube case, the weak threshold phenomenon may be 
more general than what can be proven today; it seems also to hold for some matrix 
ensembles that may not have an orthant-symmetric null space. 

4. Contrasting the Cone with the Hypercube 

The weak Cone threshold depends very much more delicately on details about 
A than do the hypercube thresholds; it really makes a difference to the results if 
the matrix A is not 'zero-mean'. 

4.1. The Low-Frequency Partial Fourier Matrix. Consider the special partial
Fourier matrix made only of the n lowest frequency entries. 



Corollary 4.1. Assume n is odd and let 

(4.1) n,, = 

Then 



cos( -(^-^^<'-^^ ) * = l,3,5,...,n 
sin(l(2_i)i) z = 2,4,6,...,n- 



$f_k(\Omega\mathbb{R}^N_+) = f_k(\mathbb{R}^N_+), \quad k = 0, 1, \dots, \frac{1}{2}(n-1).$

This behavior is dramatically different than the case for random A of the type 
considered so far, and in some sense dramatically better. 

Corollary 4.1 is closely connected with the classical question of neighborliness.
There are famous polytopes which can be generated by projections $AT^{N-1}$ and
have exactly as many $k$-faces as $T^{N-1}$ for $k < \lfloor n/2 \rfloor$. A standard example is
provided by the matrix $\Omega$ defined in (4.1); it obeys $f_k(\Omega T^{N-1}) = f_k(T^{N-1})$,
$0 \le k < \lfloor n/2 \rfloor$. (There is a vast literature touching in some way on the phenomenon
$f_k(\Omega T^{N-1}) = f_k(T^{N-1})$. In that literature, the polytope $\Omega T^{N-1}$ is usually called a
cyclic polytope, and the columns of $\Omega$ are called points of the trigonometric moment
curve; see standard references [HI [20]).

Hence the matrix $\Omega$ offers both $f_k(\Omega T^{N-1}) = f_k(T^{N-1})$ and
$f_k(\Omega\mathbb{R}^N_+) = f_k(\mathbb{R}^N_+)$ for $0 \le k < \lfloor n/2 \rfloor$. This is exceptional. For random $A$
of the type discussed in earlier sections, there is a large disparity between the sets of
triples $(k, n, N)$ where $f_k(AT^{N-1}) = f_k(T^{N-1})$ (this happens for
$k/n < \rho_S(n/N; T^{N-1})$) and those where $f_k(A\mathbb{R}^N_+) = f_k(\mathbb{R}^N_+)$ (this happens for
$k/n < \rho_S(n/N; \mathbb{R}^N_+)$). These two strong thresholds are displayed in Figures 3.1
and 1.1 respectively.

Even if we relax our notion of agreement of face counts to weak agreement, the
collections of triples where $f_k(AT^{N-1}) \approx f_k(T^{N-1})$ and $f_k(A\mathbb{R}^N_+) \approx f_k(\mathbb{R}^N_+)$ are
very different, because the two curves $\rho_W(n/N; T^{N-1})$ and $\rho_W(n/N; \mathbb{R}^N_+)$ are so
dramatically different, particularly at $n < N/2$.

4.2. Adjoining a Row of Ones to A. An important feature of the random matrices
$A$ studied earlier is that their random nullspace is orthant symmetric. In
particular, the positive orthant plays no distinguished role with respect to these
matrices. On the other hand, the partial Fourier matrix $\Omega$ constructed in the last
subsection contains a row of ones, and thus the positive orthant has a distinguished
role to play for this matrix. Moreover, this distinction is crucial; we find empirically
that removing the row of ones from $\Omega$ causes the conclusion of Corollary 4.1 to fail
drastically.

Conversely, consider the matrix $\tilde{A}$ obtained by adjoining a row of $N$ ones to
some matrix $A$:

$\tilde{A} = \begin{bmatrix} \mathbf{1}' \\ A \end{bmatrix}.$

Adding this row of ones to a random matrix causes a drastic shift in the strong and
weak thresholds. The following is proved in Section 6.

Theorem 4.1. Consider the proportional-dimensional asymptotic with parameters
$\delta, \rho$ in $(0, 1)$. Let the random $n - 1$ by $N$ matrix $A$ have iid standard normal entries.
Let $\tilde{A}$ denote the corresponding $n$ by $N$ matrix whose first row is all ones and whose
remaining rows are identical to those of $A$. Then

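(4.2) $\lim_{n \to \infty} \frac{E f_k(\tilde{A}\mathbb{R}^N_+)}{f_k(\mathbb{R}^N_+)} = 1, \quad \rho < \rho_W(\delta, T^{N-1}),$

(4.3) $\lim_{n \to \infty} P\{ f_k(\tilde{A}\mathbb{R}^N_+) = f_k(\mathbb{R}^N_+) \} = 1, \quad \rho < \rho_S(\delta, T^{N-1}).$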

Note particularly the mixed form of this relationship. Although the conclusions 
concern the behavior of faces of the randomly-projected orthant, the thresholds are 
those that were previously obtained for the randomly-projected simplex. 

Since there is such a dramatic difference between $\rho(\delta, T^{N-1})$ and $\rho(\delta, \mathbb{R}^N_+)$, the
single row of ones can fairly be said to have a huge effect. In particular, the
region 'below' the simplex weak phase transition $\rho_W(\delta, T^{N-1})$ comprises $\approx 0.5634$
of the $(\delta, \rho)$ parameter area, and the region 'below' the hypercube weak phase
transition $\rho_W(\delta, H^N)$ comprises $1 - \log 2 \approx 0.3069$.
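The hypercube figure can be checked numerically. The sketch below assumes the closed form $\rho_W(\delta; H^N) = \max(0, 2 - 1/\delta)$, which is what the transition location $k \approx 2n - N$ gives after dividing by $n$; the function names are illustrative only.

```python
import math

def rho_w_hypercube(delta):
    """Assumed closed form of the hypercube weak threshold: max(0, 2 - 1/delta)."""
    return max(0.0, 2.0 - 1.0 / delta)

def area_below(threshold, steps=200_000):
    """Midpoint-rule estimate of the area under threshold(delta), delta in (0, 1)."""
    h = 1.0 / steps
    return sum(threshold((i + 0.5) * h) for i in range(steps)) * h

numeric_area = area_below(rho_w_hypercube)
exact_area = 1.0 - math.log(2.0)  # integral of (2 - 1/delta) over [1/2, 1]
```

The threshold is zero for $\delta \le 1/2$, so the whole area comes from the interval $[1/2, 1]$, where the integral of $2 - 1/\delta$ equals $1 - \log 2$.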

5. Application: Compressed Sensing 

Our face counting results can all be reinterpreted as statements about "simple"
solutions of underdetermined systems of linear equations. This reinterpretation
allows us to make connections with numerous problems of current interest in signal
processing, information theory, and probability. The reinterpretation follows from
the two following lemmas, which are restatements of Lemmas 2.2 and 2.5, rephrasing
the notion of (Transverse($A, x_0, Q$)) with the all but linguistically equivalent
(Unique($A, x_0, Q$)). For proofs of Lemmas 5.1 and 5.2, see the proofs of Lemmas
2.2 and 2.5.

Lemma 5.1. Let $x_0$ be a vector in $\mathbb{R}^N_+$ with exactly $k$ nonzeros. Let $F$ denote the
associated $k$-face of $\mathbb{R}^N_+$. For an $n \times N$ matrix $A$, let $AF$ denote the image of $F$
under $A$ and $b_0 = Ax_0$ the image of $x_0$ under $A$. The following are equivalent:

(Survive($A, F, \mathbb{R}^N_+$)): $AF$ is a $k$-face of $A\mathbb{R}^N_+$,

(Unique($A, x_0, \mathbb{R}^N_+$)): The system $b_0 = Ax$ has a unique solution in $\mathbb{R}^N_+$.

Lemma 5.2. Let $x_0$ be a vector in $H^N$ with exactly $k$ entries strictly between the
bounds $\{0, 1\}$. Let $F$ denote the associated $k$-face of $H^N$. For an $n \times N$ matrix $A$,
let $AF$ denote the image of $F$ under $A$ and $b_0 = Ax_0$ the image of $x_0$ under $A$.
The following are equivalent:

(Survive($A, F, H^N$)): $AF$ is a $k$-face of $AH^N$,

(Unique($A, x_0, H^N$)): The system $b_0 = Ax$ has a unique solution in $H^N$.

Note that the systems of linear equations referred to in these lemmas are
underdetermined: $n < N$. Hence these lemmas identify conditions on underdetermined
systems of linear equations, such that, when the solution is known to obey certain
constraints, there are many cases where this seemingly weak a priori knowledge in
fact uniquely determines the solution. The first result can be paraphrased as saying
that nonnegativity constraints can be very powerful, if the object is known to have
relatively few nonzeros; the second result says that upper and lower bounds can be
very powerful, provided those bounds are active in most cases.

These results provide a theoretical vantage point on an area of recent intense 
interest in signal processing, appearing variously under the labels "Compressed 
Sensing" or "Compressive Sampling".

In many practical applications of scientific and engineering signal processing
(spectroscopy is one example) one can obtain $n$ linear measurements of an object
$x$, obtaining data $b = Ax$; here the rows of the matrix $A$ give the linear response
functions of the measurement devices. We wish to reconstruct $x$, knowing only the
measurements $b$, the measurement matrix $A$, and various a priori constraints on $x$.

It could be very useful to be able to do this in the case n < N, allowing us 
to save measurement time or other resources. This seems hopeless, because the 
linear system is underdetermined; but the above lemmas show that there is some 
fundamental soundness to the idea that we can have n < N and still reconstruct. 
We now spell out the consequences of these lemmas in more detail. 

5.1. Reconstruction Exploiting Nonnegativity Constraints. In many practical
applications, such as spectroscopy and astronomy, the object $x$ to be recovered is
known a priori to be nonnegative. We wish to reconstruct the unknown $x$, knowing
only the linear measurements $b = Ax$, the matrix $A$, and the constraint $x \in \mathbb{R}^N_+$.

Let $J(x)$ be some function of $x$. Consider the positivity-constrained variational
problem

($Pos_J$) $\min J(x)$ subject to $b = Ax$, $x \in \mathbb{R}^N_+$.

Let $pos_J(b, A)$ denote any solution of the problem instance ($Pos_J$) defined by data
$b$ and matrix $A$.

Typical variational functions $J$ include

• Sparsity: $\|x\|_{\ell_0} := \#\{i : x(i) > 0\}$.

• Size: $\mathbf{1}'x$.

• negEntropy: $\sum_j x(j) \log(x(j))$.

• Energy: $\sum_j x(j)^2$.
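For concreteness, the four functionals can be written out as plain functions. This is a minimal sketch with hypothetical names; the convention $0 \log 0 = 0$ used for the negative entropy is an assumption made here so that sparse vectors have a finite value.

```python
import math

def sparsity(x):
    """Number of strictly positive entries, #{i : x(i) > 0}."""
    return sum(1 for v in x if v > 0)

def size(x):
    """Sum of the entries, 1'x."""
    return sum(x)

def neg_entropy(x):
    """Sum of x(j) log x(j); assumes the convention 0 log 0 = 0."""
    return sum(v * math.log(v) for v in x if v > 0)

def energy(x):
    """Sum of squared entries."""
    return sum(v * v for v in x)

x_example = [0.0, 0.5, 0.0, 2.0]
```

Any of these can play the role of $J$ in ($Pos_J$); the corollaries that follow do not depend on which one is chosen.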

This framework contains as special cases the popular signal processing methods of 
maximum entropy reconstruction and nonnegative least-squares reconstruction. 
We conclude the following: 

Corollary 5.1. Suppose that

$f_k(A\mathbb{R}^N_+) = f_k(\mathbb{R}^N_+).$

Let $x_0 \ge 0$ and $\|x_0\|_{\ell_0} \le k$. For the problem instance defined by $b = Ax_0$,

$pos_J(b, A) = x_0.$

In words: under the given conditions on the face numbers, any variational
prescription which imposes nonnegativity constraints will correctly recover the $k$-sparse
solution in any problem instance where such a $k$-sparse solution exists. This may
seem surprising; as $n < N$, the system of linear equations is underdetermined, yet
we correctly find a sparse solution if it exists.

Corresponding to this 'strong' statement is a 'weak' statement. Consider the
following probability measure on $k$-sparse problem instances.

• Choose a random subset $I$ of size $k$ from $\{1, \ldots, N\}$, by $k$ simple random
draws without replacement.

• Set the entries of $x_0$ not in the selected subset to zero.

• Choose the entries of $x_0$ in the selected set $I$ from some fixed joint
distribution $\psi_I$ supported in $(0, 1)^k$.

• Generate the problem instance $b = Ax_0$.

We speak of drawing a $k$-sparse problem instance at random.
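The sampling scheme above can be sketched as follows. The joint distribution $\psi_I$ is left unspecified in the text; the sketch assumes, purely for illustration, iid uniform draws on $(0, 1)$, and the function name `draw_sparse_instance` is hypothetical.

```python
import random

def draw_sparse_instance(A, k, rng):
    """Draw a k-sparse x0 on a random support and return (b, x0, support)."""
    N = len(A[0])
    # k simple random draws without replacement from the column indices.
    support = sorted(rng.sample(range(N), k))
    x0 = [0.0] * N
    for j in support:
        # Assumption: psi_I is iid uniform on the open interval (0, 1).
        v = rng.random()
        while v == 0.0:
            v = rng.random()
        x0[j] = v
    # Problem instance b = A x0.
    b = [sum(a * x for a, x in zip(row, x0)) for row in A]
    return b, x0, support

rng = random.Random(0)
A_demo = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
b_demo, x0_demo, support_demo = draw_sparse_instance(A_demo, 3, rng)
```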

Corollary 5.2. Suppose that for some $\epsilon \in (0, 1)$,

$f_k(A\mathbb{R}^N_+) \ge (1 - \epsilon) \cdot f_k(\mathbb{R}^N_+).$

For $(b, A)$ a problem instance drawn at random, as above:

$\mathrm{Prob}\{pos_J(b, A) = x_0\} \ge (1 - \epsilon).$

In words: under the given conditions on the face lattice, any variational
prescription which imposes nonnegativity constraints will correctly recover
the $k$-sparse solution in at least a fraction $(1 - \epsilon)$ of all $k$-sparse problem instances.
This may seem surprising; since $n < N$, the system of linear equations is
underdetermined, and yet we typically find a sparse solution if it exists.

Here are some simple applications:

• In the proportional-dimensional framework, consider triples $(k_n, n, N_n)$ with
parameters $\delta, \rho$. Let $A$ denote an $n$ by $N_n$ matrix having random nullspace
which is orthant symmetric and generic.

  - If the parameters $\delta, \rho$ name a point 'below' the orthant weak threshold
  $\rho_W(\delta; \mathbb{R}^N_+)$, then for the vast majority of $k_n$-sparse vectors, any
  variational method will correctly recover the vector.

  - If the parameters $\delta, \rho$ name a point 'below' the orthant strong threshold
  $\rho_S(\delta; \mathbb{R}^N_+)$, then for large enough $n$, every $k_n$-sparse vector can
  be correctly recovered by any variational method imposing positivity
  constraints.

• In the proportional-dimensional asymptotic, consider triples $(k_n, n, N_n)$
with parameters $\delta, \rho$. Let $A_0$ denote an $n - 1$ by $N_n$ matrix having iid
standard normal entries, and let $A$ denote the $n$ by $N_n$ matrix formed by
adjoining a row of ones to $A_0$.

  - If the parameters $\delta, \rho$ name a point 'below' the simplex weak threshold
  $\rho_W(\delta; T^{N-1})$, then for the vast majority of $k_n$-sparse vectors, any
  variational method will correctly recover the sparse vector.

  - If the parameters $\delta, \rho$ name a point 'below' the simplex strong threshold
  $\rho_S(\delta; T^{N-1})$, then for large enough $n$, every $k_n$-sparse vector can
  be correctly recovered by any variational method imposing positivity
  constraints.

• Let $A$ denote the $n$ by $N$ partial Fourier matrix built from low frequencies
and called $\Omega$ in Section 4.1. Every $\lfloor n/2 \rfloor$-sparse vector will be correctly
recovered by any variational method imposing positivity constraints.

Hence in positivity-constrained reconstruction problems where the object to be
recovered is zero in most entries (an assumption which approximates the truth in
many problems of spectroscopy and astronomical imaging [S]), we can work with
fewer than $N$ samples. The above paragraphs show that it matters a great deal
what matrix $A$ we use. Our preference order:

$\Omega$ is better than the random matrix $\tilde{A}$, which in turn is better than a random
zero-mean matrix $A$.

(These results extend and generalize results which were previously obtained by
the authors in [TTJ, in the case where $J(x) = \mathbf{1}'x$, and by the first author and
coauthors in [5]; see also Fuchs [TS] and Bruckstein, Elad, and Zibulevsky [5].)

5.2. Reconstruction Exploiting Box Constraints. Consider again the problem
of reconstruction from measurements $b = Ax$, but this time assuming the object
$x$ obeys box constraints: $0 \le x(j) \le 1$, $1 \le j \le N$. Such constraints can arise for
example in infrared absorption spectroscopy and in binary digital communications.
We define the box-constrained variational problem

($Box_J$) $\min J(x)$ subject to $b = Ax$, $0 \le x(j) \le 1$, $j = 1, \ldots, N$.

Let $box_J(b, A)$ denote any solution of the problem instance ($Box_J$) defined by data
$b$ and matrix $A$.

In this setting, the notion corresponding to 'sparse' is 'simple'. We say that a
vector $x$ is $k$-simple if at most $k$ of its entries differ from the bounds $\{0, 1\}$. Here,
the interesting functions $J$ penalize deviations from simple structure; they include:

• Simplicity: $\#\{i : x(i) \notin \{0, 1\}\}$.

• Violation Energy: $\sum_j x(j)(1 - x(j))$.
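Both penalties are easy to state in code. The sketch below uses hypothetical function names and simply evaluates the two functionals on a small box-constrained vector.

```python
def simplicity(x):
    """Number of entries that differ from both bounds 0 and 1."""
    return sum(1 for v in x if v not in (0.0, 1.0))

def violation_energy(x):
    """Sum of x(j) * (1 - x(j)); zero exactly when every entry is 0 or 1."""
    return sum(v * (1.0 - v) for v in x)

x_box = [0.0, 1.0, 0.5, 0.25]
```

Each term $x(j)(1 - x(j))$ is nonnegative on the box and vanishes only at the bounds, so the violation energy is a smooth surrogate for the simplicity count.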

Corollary 5.3. Suppose that

$f_k(AH^N) = f_k(H^N).$

Let $x_0$ be a $k$-simple vector obeying the box constraints $0 \le x_0 \le 1$. For the
problem instance defined by $b = Ax_0$,

$box_J(b, A) = x_0.$

In words: under the given conditions on the face lattice, any variational
prescription which imposes box constraints, when presented with a problem instance
where there is a $k$-simple solution, will correctly recover the $k$-simple solution.

Corresponding to this 'strong' statement is a 'weak' statement. Consider the
following probability measure on problem instances having $k$-simple solutions.
Recall that $k$-simple vectors have all entries equal to 0 or 1 except at $k$ exceptional
locations.

• Choose the subset $I$ of $k$ exceptional entries uniformly at random from the
set $\{1, \ldots, N\}$ without replacement.

• Choose the nonexceptional entries to be either 0 or 1 based on tossing a
fair coin.

• Choose the values of the exceptional $k$ entries according to a joint
probability measure $\psi_I$ supported in $(0, 1)^k$.

• Define the problem instance $b = Ax_0$.
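The four steps above translate directly into a short sampler. As a sketch only: $\psi_I$ is again assumed to be iid uniform on $(0, 1)$, which the text does not require, and the name `draw_simple_instance` is hypothetical.

```python
import random

def draw_simple_instance(A, k, rng):
    """Draw a k-simple x0 in the unit box and return (b, x0, exceptional)."""
    N = len(A[0])
    # Exceptional locations: k indices chosen without replacement.
    exceptional = set(rng.sample(range(N), k))
    x0 = []
    for j in range(N):
        if j in exceptional:
            # Assumption: psi_I is iid uniform on the open interval (0, 1).
            v = rng.random()
            while v == 0.0:
                v = rng.random()
            x0.append(v)
        else:
            # Fair coin toss between the two bounds.
            x0.append(float(rng.randint(0, 1)))
    b = [sum(a * x for a, x in zip(row, x0)) for row in A]
    return b, x0, sorted(exceptional)

rng = random.Random(1)
A_box = [[rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(6)]
b_box, x0_box, exc_box = draw_simple_instance(A_box, 2, rng)
```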

Corollary 5.4. Suppose that for some $\epsilon \in (0, 1)$,

$f_k(AH^N) \ge (1 - \epsilon) \cdot f_k(H^N).$

Randomly sample a problem instance $(b, A)$ using the method just described. Then

$P\{box_J(b, A) = x_0\} \ge (1 - \epsilon).$

In words: under the given conditions on the face lattice, any variational
prescription which imposes box constraints will correctly recover the solution in at
least a fraction $(1 - \epsilon)$ of all underdetermined systems generated by the matrix $A$
which have $k$-simple solutions.

Here is a simple application. In the proportional-dimensional asymptotic
framework, consider triples $(k_n, n, N_n)$ with parameters $\delta, \rho$. Let $A$ denote an $n$ by $N_n$
matrix having random nullspace which is orthant symmetric and generic. If the
parameters $\delta, \rho$ name a point 'below' the hypercube weak threshold, then for the vast
majority of $k_n$-simple vectors, any variational method imposing box constraints will
correctly recover the vector.

In the hypercube case, to our knowledge, there is no phenomenon comparable
to that which arose in the positive orthant with the special constructions $\Omega$ and $\tilde{A}$.

Consequently, the hypercube weak threshold is the best known general result 
on the ability to undersample by exploiting box constraints. In particular, the 
difference between the weak simplex threshold and the weak hypercube threshold 
has this interpretation: 

A given degree $k$ of sparsity of a nonnegative object is much more
powerful than that same degree of simplicity of a box-constrained
object.

Specifically, we shouldn't expect to be able to undersample a typical box-constrained
object by more than a factor of 2 and then reconstruct it using some garden-variety
variational prescription. In comparison, the last section showed that we can severely
undersample very sparse nonnegative objects.

Because box constraints are of interest in important areas of signal processing, 
it seems that much more attention should be paid to thresholds associated with the 
hypercube. 

6. Additional Proofs 

6.1. Proof of Lemma 2.2. Let $b_0 := Ax_0$.

Assume (Survive($A, F, \mathbb{R}^N_+$)), that $AF$ is a $k$-face of $A\mathbb{R}^N_+$. General position of
$A$ implies that $AF$ is a simplicial cone of dimension $k - 1$, and that there exists a
unique $x \in \mathbb{R}^N_+$ satisfying $Ax = b_0$, with $x_0$ being that solution. We now assume
$\exists\, \nu \in \mathcal{N}(A) \cap Feas_{x_0}(\mathbb{R}^N_+)$ with $\nu \ne 0$. Then $\exists\, \epsilon > 0$ small enough such that
$z_0 := x_0 + \epsilon\nu \in \mathbb{R}^N_+$. This $z_0$ satisfies $Az_0 = b_0$, in contradiction to the uniqueness
condition previously stated; therefore $\mathcal{N}(A) \cap Feas_{x_0}(\mathbb{R}^N_+) = \{0\}$.

For the converse direction, assume (Transverse($A, x_0, \mathbb{R}^N_+$)), that
$\mathcal{N}(A) \cap Feas_{x_0}(\mathbb{R}^N_+) = \{0\}$. Assume $AF$ is not a $k$-face of $A\mathbb{R}^N_+$, that is, $AF$ is
interior to $A\mathbb{R}^N_+$. As $A$ projects the interior of $\mathbb{R}^N_+$ to the complete interior of
$A\mathbb{R}^N_+$, $\exists\, z_0 \in \mathbb{R}^N_+$ with $z_0 > 0$ and $Az_0 = b_0$. The difference $\nu := z_0 - x_0 \ne 0$,
but $\nu \in \mathcal{N}(A) \cap Feas_{x_0}(\mathbb{R}^N_+)$, contradicting the Transverse assumption, implying
$AF$ is a $k$-face of $A\mathbb{R}^N_+$.

□
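The perturbation step in the first half of the proof can be made concrete on a toy example. The sketch below is illustrative only: the 1-by-3 matrix, the point $x_0$, and the direction $\nu$ are chosen by hand so that $\nu$ lies in the nullspace and is nonnegative off the support of $x_0$, and the helper names are hypothetical.

```python
def max_feasible_step(x0, nu):
    """Largest eps with x0 + eps * nu >= 0 entrywise (inf if nu >= 0)."""
    ratios = [x / -v for x, v in zip(x0, nu) if v < 0]
    return min(ratios) if ratios else float("inf")

def matvec(A, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

A = [[1.0, 1.0, 1.0]]
x0 = [1.0, 1.0, 0.0]
nu = [1.0, -2.0, 1.0]   # A nu = 0, and nu >= 0 where x0 vanishes

eps = 0.5 * max_feasible_step(x0, nu)
z0 = [x + eps * v for x, v in zip(x0, nu)]
```

Here $z_0 \ne x_0$ is nonnegative and has the same image under $A$, so a nonzero vector in $\mathcal{N}(A) \cap Feas_{x_0}(\mathbb{R}^N_+)$ does destroy uniqueness, as the proof asserts.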
6.2. Proof of Lemma 2.5. This proof follows similarly to that of Lemma 2.2 and
is omitted.

6.3. Proof of Lemma 2.6. For points $x_0$ on $k$-faces of $H^N$ that are also $k$-faces
of $\mathbb{R}^N_+$ they share the same feasible set

$Feas_{x_0}(H^N) = Feas_{x_0}(\mathbb{R}^N_+)$

and by Lemmas 2.2 and 2.5 the probabilities of (Survive($A, F, Q$)) for $Q = \mathbb{R}^N_+, H^N$
must be equal. Consider a point $x_0$ on a $k$-face of $H^N$ that is not a $k$-face of $\mathbb{R}^N_+$;
without loss of generality, due to column exchangeability, let

$x_0(i) = \begin{cases} 0, & i = 1, \ldots, \ell, \\ 1, & i = \ell + 1, \ldots, N - k, \\ 1/2, & i = N - k + 1, \ldots, N. \end{cases}$

Then $Feas_{x_0}(H^N) = \{\nu : \nu_1, \ldots, \nu_\ell \ge 0,\ \nu_{\ell+1}, \ldots, \nu_{N-k} \le 0\}$. Following the proof
of Lemma 2.3, condition (Transverse($A, x_0, H^N$)) can be restated as

(6.1) The only vector $c$ satisfying

(Ineq$_\ell$) $(B^T c)_i \ge 0$, $i = 1, \ldots, \ell$; $\quad (B^T c)_i \le 0$, $i = \ell + 1, \ldots, N - k$,

is the vector $c = 0$,

where $B$ is the orthogonal complement of $A$.

Orthant symmetry of $B$ states that the sign of $(B^T c)_i$ is equiprobable;
consequently, the probability of the event named in (6.1) is independent of $\ell$, and is in fact
equal to the probability of the event named in (2.3).

□

6.4. Proof of Corollary 4.1. The result is a corollary of [9, Theorem 3, pp. 56].
However, it may require effort on the part of readers to see this, so we select the
key step from the proof of Theorem 3, [9, Lemma 2, pp. 63], and use it directly
within the framework of this paper.

As $n$ is odd, write $n = 2m + 1$ where $m$ is an integer. The range of the matrix
$\Omega$ is the span of all Fourier frequencies from 0 to $\pi(m - 1)/N$. In accord with
terminology in electrical engineering, this space of vectors will be called the space
of Lowpass sequences $\mathcal{L}(m)$. The nullspace of $\Omega$ is the span of all Fourier frequencies
from $\pi m/N$ to $\pi/N$. It will be called the space of Highpass sequences $\mathcal{H}(m)$.

We have the following:

Lemma 6.1. [9] Every sequence in $\mathcal{H}(m)$ has at least $m$ negative entries.

Recall condition (Transverse($\Omega, x_0, \mathbb{R}^N_+$)). If $x_0$ has $k$ nonzeros, then vectors in
$Feas_{x_0}(\mathbb{R}^N_+)$ have at most $k$ negative entries. But vectors in $\mathcal{N}(\Omega) = \mathcal{H}(m)$ have at
least $m$ negative entries. Therefore, if $m > k$, (Transverse($\Omega, x_0, \mathbb{R}^N_+$)) must hold.

By Lemma 2.2, (Survive($\Omega, F, \mathbb{R}^N_+$)) must hold for every $k$-face with $k < m$.
Hence $f_{k-1}(\Omega\mathbb{R}^N_+) = f_{k-1}(\mathbb{R}^N_+)$, for $k < m = \frac{1}{2}(n - 1)$. □

6.5. Proof of Theorem 4.1. The Theorem is an immediate consequence of the
following identity.

Lemma 6.2. Suppose that the row vector $\mathbf{1}$ is not in the row span of $A$. Then

$f_k(\tilde{A}\mathbb{R}^N_+) = f_{k-1}(AT^{N-1}), \quad 0 < k < n.$

Proof. We observe that there is a natural bijection between $k$-faces of $\mathbb{R}^N_+$ and the
$(k-1)$-faces of $T^{N-1}$. The $(k-1)$-faces of $T^{N-1}$ are in bijection with the corresponding
support sets of cardinality $k$: i.e. we can identify with each $(k-1)$-face $F$ the union $I$ of
all supports of all members of the face. Similarly to each support set $I$ of cardinality
$k$ there is a unique $k$-face $\tilde{F}$ of $\mathbb{R}^N_+$ consisting of all points in $\mathbb{R}^N_+$ whose support lies
in $I$. Composing bijections $F \leftrightarrow I \leftrightarrow \tilde{F}$ we have the bijection $F \leftrightarrow \tilde{F}$.

Concretely, let $x_0$ be a point in the relative interior of some $(k-1)$-face $F$ of
$T^{N-1}$. Then $x_0$ has $k$ nonzeros. $x_0$ is also in the relative interior of the $k$-face $\tilde{F}$ of
$\mathbb{R}^N_+$. Conversely, let $y_0$ be a point in the relative interior of some $k$-face of $\mathbb{R}^N_+$; then
$x_0 = (\mathbf{1}'y_0)^{-1} y_0$ is a point in the relative interior of a $(k-1)$-face of $T^{N-1}$.
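The normalization $x_0 = (\mathbf{1}'y_0)^{-1} y_0$ is simple enough to spell out. The following sketch, with hypothetical helper names, checks on one example that rescaling a nonzero point of the orthant onto the simplex leaves its support unchanged.

```python
def to_simplex(y0):
    """Rescale a nonzero nonnegative vector so that its entries sum to one."""
    total = sum(y0)
    return [v / total for v in y0]

def support(x):
    """Set of indices where x is nonzero."""
    return {i for i, v in enumerate(x) if v != 0}

y0 = [0.0, 3.0, 0.0, 1.0]   # relative interior of a 2-face of the orthant
x0 = to_simplex(y0)         # relative interior of a 1-face of the simplex
```

Because the support is preserved, the face of $T^{N-1}$ containing $x_0$ and the face of $\mathbb{R}^N_+$ containing $y_0$ are indexed by the same set $I$, which is the correspondence used in the proof.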

The last two paragraphs show that for each pair of corresponding faces $(F, \tilde{F})$,
we may find a point $x_0$ in both the relative interior of $F$ and the relative
interior of $\tilde{F}$. For such $x_0$,

$Feas_{x_0}(\mathbb{R}^N_+) = Feas_{x_0}(T^{N-1}) + lin(x_0).$

Clearly $\mathcal{N}(\tilde{A}) \cap lin(x_0) = \{0\}$, because $\mathbf{1}'x_0 > 0$. We conclude that the following
are equivalent:

(Transverse($A, x_0, T^{N-1}$)): $\mathcal{N}(A) \cap Feas_{x_0}(T^{N-1}) = \{0\}$.

(Transverse($\tilde{A}, x_0, \mathbb{R}^N_+$)): $\mathcal{N}(\tilde{A}) \cap Feas_{x_0}(\mathbb{R}^N_+) = \{0\}$.

Rephrasing [11], the following are equivalent for $x_0$ a point in the relative interior
of $F$:

(Survive($A, F, T^{N-1}$)): $AF$ is a $(k-1)$-face of $AT^{N-1}$,

(Transverse($A, x_0, T^{N-1}$)): $\mathcal{N}(A) \cap Feas_{x_0}(T^{N-1}) = \{0\}$.

We conclude that for two corresponding faces $F, \tilde{F}$, the following are equivalent:

(Survive($A, F, T^{N-1}$)): $AF$ is a $(k-1)$-face of $AT^{N-1}$,

(Survive($\tilde{A}, \tilde{F}, \mathbb{R}^N_+$)): $\tilde{A}\tilde{F}$ is a $k$-face of $\tilde{A}\mathbb{R}^N_+$.

Combining this with the natural bijection $F \leftrightarrow \tilde{F}$, the lemma is proved. □